On Tighter Inequalities for Efficient Similarity Search in Metric Spaces

Authors

  • Tao Ban
  • Youki Kadobayashi
Abstract

Similarity search is the efficient retrieval of relevant information that satisfies user-formulated query conditions from a database with prebuilt indexing structures. Because evaluating the distance function between a query and an indexed object is often computationally expensive, there have been many attempts to build indexing structures that answer queries with as few distance computations as possible. Among these methods, the Approximating and Eliminating Search Algorithm (AESA) has been the baseline for 20 years in terms of the number of distance computations required. By storing a precomputed inter-object distance matrix, AESA can apply triangle-inequality-based pruning rules extensively to avoid unnecessary distance computations. In this paper, to further improve the performance of AESA, we introduce a novel group of pruning rules that are proven to be tighter than the triangle-inequality-based rules and hence can further reduce the number of distance computations during the search. The new pruning rules require the assumption of a positive semi-definite metric space model and can be used in most modern applications. With slight modification, they can be extended to search algorithms in general metric spaces. In simulations, AESA with the proposed pruning rules showed a significant reduction in distance computations: for low-dimensional problems the new rules cut the distance computations in half, and for high-dimensional problems the reduction was sometimes more than 90%. The pruning rules were also applied to LAESA, a variant of AESA with a linear storage requirement. For this algorithm, they not only saved more distance computations but also considerably reduced the storage requirement.
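To make the baseline concrete, the following is a minimal Python sketch of an AESA-style nearest-neighbor search that uses only the classic triangle-inequality lower bound |d(q, p) - d(p, o)| <= d(q, o). It does not implement the tighter pruning rules proposed in the paper, which the abstract does not spell out. The function names `build_distance_matrix` and `aesa_nearest`, and the choice of candidate with the smallest current lower bound, are illustrative assumptions rather than the authors' code.

```python
import math
from typing import Callable, List, Sequence, Tuple


def build_distance_matrix(objects: Sequence, dist: Callable) -> List[List[float]]:
    """Precompute all inter-object distances (quadratic storage, as in AESA)."""
    n = len(objects)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = dist(objects[i], objects[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


def aesa_nearest(query, objects: Sequence, matrix: List[List[float]],
                 dist: Callable) -> Tuple[int, float, int]:
    """Return (index, distance, number of distance computations).

    Hypothetical sketch: only the triangle-inequality bound is used.
    """
    n = len(objects)
    lower = [0.0] * n            # best known lower bound on d(query, objects[i])
    alive = set(range(n))        # candidates not yet computed or eliminated
    best_idx, best_dist = -1, math.inf
    computations = 0

    while alive:
        # Approximating step: try the candidate with the smallest lower bound.
        pivot = min(alive, key=lambda i: lower[i])
        alive.remove(pivot)
        d_qp = dist(query, objects[pivot])
        computations += 1
        if d_qp < best_dist:
            best_idx, best_dist = pivot, d_qp

        # Eliminating step: tighten bounds and drop hopeless candidates.
        for i in list(alive):
            lower[i] = max(lower[i], abs(d_qp - matrix[pivot][i]))
            if lower[i] > best_dist:
                alive.remove(i)

    return best_idx, best_dist, computations
```

A possible usage, assuming 2-D points under the Euclidean distance: build the matrix once with `build_distance_matrix(points, math.dist)`, then call `aesa_nearest(q, points, matrix, math.dist)` for each query. The third return value counts the query-to-object distance evaluations, which is the quantity the abstract's comparisons refer to.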

Similar resources

Extensions of Some Fixed Point Theorems for Weak-Contraction Mappings in Partially Ordered Modular Metric Spaces

The purpose of this paper is to establish fixed point results for a single mapping in a partially ordered modular metric space, and to prove a common fixed point theorem for two self-maps satisfying some weak contractive inequalities.

On the Monotone Mappings in CAT(0) Spaces

In this paper, we first introduce a monotone mapping and its resolvent in general metric spaces. Then, we give two new iterative methods, combining the resolvent method with Halpern's iterative method and the viscosity approximation method, for finding a fixed point of monotone mappings and a solution of variational inequalities. We prove convergence theorems of the proposed iterations in ...

Indicator of $S$-Hausdorff metric spaces and coupled strong fixed point theorems for pairwise contraction maps

In the study of fixed points of an operator, it is useful to consider a more general concept, namely the coupled fixed point. In this paper, using the notion of a partial metric, we introduce an $S$-Hausdorff metric space on the set of all closed and bounded subsets of $X$. Then fixed point results for multivalued continuous and surjective mappings are presented. Furthermore, we give a positive resul...

K-medoids LSH: a new locality sensitive hashing in general metric space

The increasing availability of multimedia content poses a challenge for information retrieval researchers. Users want not only to have access to multimedia documents but also to make sense of them; the ability to find specific content in extremely large collections of textual and non-textual documents is paramount. At such large scales, Multimedia Information Retrieval systems must rely on the abi...

New Approaches to Similarity Searching in Metric Spaces

Title of dissertation: NEW APPROACHES TO SIMILARITY SEARCHING IN METRIC SPACES. Cengiz Celik, Doctor of Philosophy, 2006. Dissertation directed by: Professor David Mount, Department of Computer Science. The complex and unstructured nature of many types of data, such as multimedia objects, text documents, and protein sequences, requires the use of similarity search techniques for retrieval of informatio...

Publication date: 2008